[CIR][CIRGen][TBAA][NFC] Skeleton for union, enum and bitint #1250

PikachuHyA · 2024-12-20T06:42:22Z

While following the structure of the original code generation, I overlooked the getTypeInfoHelper function, which resulted in unions not being handled appropriately. see #1220 .
In this patch, I have implemented the getTypeInfoHelper function and marked enum and bitint as NYI.

For further details, please refer to the specific commit: llvm/llvm-project@a5986b9#diff-9964470fc240f2670aab82f768ff846ce19ea37a06a47cd7b4599fa3ef92165cL109.

Pull Request: llvm#872

@bcardosolopes

1. Add new `cir.vtt.address_point` op for visiting the element of VTT to initialize the virtual pointer. 2. Implement `getVirtualBaseClassOffset` method which provides a virtual offset to adjust to actual virtual pointers in virtual base. 3. Follows the original clang CodeGen scheme for the implementation of most other parts. @bcardosolopes's note: this is cherry-picked from an older PR from Jing Zhang and slightly modified for updates: applied review, test, doc and operation syntax. It does not yet has LLVM lowering support, I'm going to make incremental changes on top of this. Any necessary CIR modifications to this design should follow up shortly too. Also, to make this work I also added more logic to `addImplicitStructorParam`s` and `buildThisParam`.

as title. The generated code is the same as Clang codeden except in a small discrepancy when GEP: OG generates code like this: `%6 = getelementptr inbounds <4 x i16>, ptr %retval.i, i32 1` CIR generates a bit differently: `%6 = getelementptr <4 x i16>, ptr %retval.i, i64 1` Ptr offest might be trivial because choosing i64 over i32 as index type seems to be LLVM Dialect's choice. The lack of `inbounds` keyword might be an issue as `mlir::cir::PtrStrideOp` is currently not lowering to LLVM:GEPOp with `inbounds` attribute as `mlir::cir::PtrStrideOp` itself has no `inbounds`. It's probably because there was no need for it though we do have an implementation of [`CIRGenFunction::buildCheckedInBoundsGEP` ](https://github.com/llvm/clangir/blob/10d6f4b94da7e0181a070f0265d079419d96cf78/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp#L2762). Anyway, the issue is not in the scope of this PR and should be addressed in a separate PR. If we think this is an issue, I can create another PR and probably add optional attribute to `mlir::cir::PtrStrideOp` to achieve it. In addition to lowering work, a couple of more works: 1. Did a little refactoring on variable name changing into desired CamelBack case. 2. Changed neon-misc RUN Options to be consistent with other neon test files and make test case more concise.

…#951) as title. There are two highlights of the PR 1. The PR introduced a new test file to cover neon intrinsics that move data, which is a big category. This would the 5th neon test file. And we're committed to keep total number of neon test files within 6. This file uses another opt option instcombine, which makes test LLVM code more concise, and our -fclangir generated LLVM code would be identical to OG with this. It looks like OG did some instcombine optimization. 2. `getIntFromMLIRValue` helper function could be substituted by [`mlir::cir::IntAttr getConstOpIntAttr` in CIRGenAtomic.cpp](https://github.com/llvm/clangir/blob/24b24557c98d1c031572a567b658cfb6254f8a89/clang/lib/CIR/CodeGen/CIRGenAtomic.cpp#L337). The function `mlir::cir::IntAttr getConstOpIntAttr` is doing more than `getIntFromMLIRValue`, and there is FIXME in the comment, so not sure if we should just use `mlir::cir::IntAttr getConstOpIntAttr`, either is fine with me.

…ly (llvm#961) Close llvm#957 the previous algorithm to convert a multiple dimension array to a tensor is: fill the value one by one and fill the zero values in conditions. And it has some problems handling the multiple dimension array as above issue shows so that the generated values are not in the same shape with the original array. the new algorithm here is, full fill the values ahead of time with the correct element size and full fill the values to different slots and we only need to maintain the index to write. I feel the new version has better performance (avoid allocation) and better readability slightly.

…o llvm intrinsic (llvm#960) This PR refactored Neon Built in code in clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp a bit to make it cleaner. Also changed RUNOption of test file clang/test/CIR/CodeGen/AArch64/neon-arith.c to make test more concise, and easy to compare against OG (to compare, just remove -fclangir from llvm gen part of RUN, and the test should still pass)

This is the following up fix for the previous fix llvm#961 See the attached new test for the reproducer. Sorry for the initial overlook.

- The flag is the default even for cc1, so make it disable two level deep. - While here, remove the unnecessary flag disable for pure `-emit-cir`.

…nv-lowering While here, add more unrecheables to cover some of the current errors, so that our users can see a clear message instead of a random cast assert of sorts. This covers at least all crashes seen when removing -fno-clangir-call-conv-lowering from all tests, there are probably other things we'll find as we exercise this path.

These are not meant to be used by any other component, make sure it's very specific.

This is the usual copy-paste-modify from CodeGen, though I changed all the variable names to conform to our new style. All these functions should be pulled out as common helpers when we're upstream.

llvm/llvm-project@1d0bd8e moves a conditional from CodeGen to AST, and this follows suit for consistency. (Our support for the Microsoft ABI is NYI anyway; this is just to make things simpler to follow when matching up logic between CodeGen and CIRGen.)

There is no change to testing functionality. This refacot let those files have the same Run options that is easier to maintain and extend.

…#976) Close llvm#975 See the attached test case for example

This PR adds initial support for the `__int128` type. The `!cir.int` type is extended to support 128-bit integer types. This PR comes with a simple test that verifies the CIRGen and LLVM lowering of `!s128i` and `!u128i` work. Resolve llvm#953 .

…ll (llvm#982)

…ulhq_lane and vqrdmulh_lane (llvm#985)

…d (in CIR + Direct to LLVM) (llvm#966) Fixes llvm#931 Added type definition in CIRTypes.td, created appropriate functions for the same in CIRTypes.cpp like getPreferredAlignment, getPreferredAlignment, etc. Optionally added lowering in LowerToLLVM.cpp

… pipeline This is causing lots of churn. `-fclangir-call-conv-lowering` is not mature enough, assumptions are leading to crashes we cannot track with special messages, leading to not great user experience. Turn this off until we have someone dedicated to roll this out.

While here add some bits for ptr auth and match OG.

…lvm#990) LLVM's verifier enforces this, which was previously causing us to fail verification. This is a bit of a band-aid; the overall linkage and visibility setting flow needs some work to match the original.

This would facilitate implementation of neon intrinsic `neon_vmax_v` and `__builtin_elementwise_max`, and potentially future optimizations. CIR BinOp supports vector type. Floating point has already been supported by FMaxOp.

This option creates and links all tools against a single libLLVM shared library (and the corresponding CLANG_LINK_CLANG_DYLIB option also gets turned on by default). In order for this to work, we need to use LINK_COMPONENTS instead of LINK_LIBS for all LLVM dependencies, and clang_target_link_libraries for all Clang dependencies, so that they get rewritten to use the dylib. Remove llvm_update_compile_flags while I'm here, since the build macros handle that internally. Before this change, we'd link against certain LLVM libraries both statically and dynamically, leading to test failures from duplicate singletons. The way this works for MLIR is fragile right now: MLIR can create its own dylib as well but doesn't have build system support for linking against that dylib. We end up folding the MLIR libraries into libclang-cpp.so (because all Clang dependencies get pulled into it), but MLIR tools still link the MLIR libraries statically. It'll still work, but BUILD_SHARED_LIBS is possibly a better alternative for development. Distributions like Fedora build their LLVM packages with LLVM_LINK_LLVM_DYLIB, so we'll want to eventually have good MLIR support for that setup too.

…lvm#1203) C/C++ functions returning void had an explicit !cir.void return type while not having any returned value, which was breaking a lot of MLIR invariants when the CIR dialect is used in a greater context, for example with the inliner. Now, a C/C++ function returning void has not return type and no return values, which does not break the MLIR invariant about the same number of return types and returned values. This change keeps the same parsing/pretty-printed syntax as before for compatibility.

llvm#1225)

…llvm#1230) References: [OG's implementation](https://github.com/llvm/clangir/blob/2b1a638ea07ca10c5727ea835bfbe17b881175cc/clang/lib/CodeGen/CGBuiltin.cpp#L13509) [Builtin's definition ](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=vrsra_n_)

Combined implementaiton with `neon_vaddlvq_u16` OG somehow implemented them separately but they are no different except signess and intrinsic name [OG's neon_vaddlvq_s16](https://github.com/llvm/clangir/blob/2b1a638ea07ca10c5727ea835bfbe17b881175cc/clang/lib/CodeGen/CGBuiltin.cpp#L13483) [OG's neon_vaddlvq_u16](https://github.com/llvm/clangir/blob/2b1a638ea07ca10c5727ea835bfbe17b881175cc/clang/lib/CodeGen/CGBuiltin.cpp#L13449)

The module-level uwtable attribute controls the unwind tables for any synthesized functions, and the function-level attribute controls them for those functions. I'll add support for this attribute to the LLVM dialect as well, but translate it from CIR directly for now to avoid waiting on the MLIR addition and a subsequent rebase.

llvm#1232) Match `CodeGenModule::SetLLVMFunctionAttributesForDefinition` so that we can see what's missing and have a good base to build upon.

…1233)

I am working on a clangir based solution to improve C++'s safety (https://discourse.llvm.org/t/rfc-a-clangir-based-safe-c/83245). This is similar with the previous analysis only approach I proposed, where we may not care about the lowered code. And this is what I described as layering problems in https://discourse.llvm.org/t/rfc-a-clangir-based-safe-c/83245 This is similar with the other issue proposed llvm#1128. We'd better to emit the higher level operations and lowering/optimizing it later. This is also inspired our handling method for VAArg, where we use ABI information to lower things during the passes. It gives me more confidence that I am doing things right.

The PR should help us to get rid of NYI `NYI UNREACHABLE executed at clang/lib/CIR/CodeGen/CIRGenExprAgg.cpp:899` [Relevant OG code here](https://github.com/llvm/clangir/blob/7fb608d4d1b72c25a1739a1bd66c9024208819cb/clang/lib/CodeGen/CGExpr.cpp#L4767): I put `HasExplicitObjectParameter` support as a missing feature, which is a new C++23 feature.

…ge (llvm#1214) #### The Problem Let's take a look at the following code: ``` struct A { ~A() {} }; int foo() { return 42; } void bar() { A a; int b = foo(); } ``` The call to `foo` guarded by the synthetic `tryOp` looks approximately like the following: ``` cir.try synthetic cleanup { %2 = cir.call exception @_Z3foov() : () -> !s32i cleanup { cir.call @_ZN1AD1Ev(%0) : (!cir.ptr<!ty_A>) -> () extra(#fn_attr1) // call to destructor of 'A' cir.yield } cir.yield } catch [#cir.unwind { cir.resume }] cir.store %2, %1: !s32i, !cir.ptr<!s32i> // CIR verification error ``` The result of the `foo` call is in the `try` region - and is not accessible from the outside, so the code generation fails with `operand #0 does not dominate its use` . #### Solution So we have several options how to handle this properly. 1. We may intpoduce a new operation here, like `TryCall` but probably more high level one, e.g. introduce the `InvokeOp`. 2. Also, we may add the result to `TryOp`. 3. The fast fix that is implemented in this PR is a temporary `alloca` where we store the call result right in the try region. And the result of the whole `emitCall` is a `load` from the temp `alloca`. So this PR is both the request for changes and an open discussion as well - how to handle this properly. So far I choose the third approach. If it's ok - I will need to create one more PR with a similar fix for the aggregated results or update this one.

This PR puts for-loop body, while-loop body, and do-while-loop body in nested scopes. Allocas in the loop body are now push down to the nested scope. Resolve llvm#1218 .

There are two sets of intrinsics regarding Min and Max operations for floating points [Maximum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrmaximum-llvmmaximumop) vs [Maxnum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrmaxnum-llvmmaxnumop) [Minimum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrminimum-llvmminimumop) vs [Minnum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrminnum-llvmminnumop) [The difference is whether NaN should be propagated when one of the inputs is NaN](https://llvm.org/docs/LangRef.html#llvm-maximumnum-intrinsic) Maxnum and Minnum would return number if one of inputs is NaN, and the other is a number, But Maximum and Minimum would return NaN (propagation of NaN) And they are resolved to different ASM such as [FMAX](https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/FMAX--vector---Floating-point-Maximum--vector--?lang=en) vs [FMAXNM](https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/FMAXNM--vector---Floating-point-Maximum-Number--vector--?lang=en) Both have user cases, we already implemented Maxnum and Minnum But Maximum and Minimum has user cases in [neon intrinsic ](https://developer.arm.com/architectures/instruction-sets/intrinsics/vmax_f32 ) and [__builtin_elementwise_maximum ](https://github.com/llvm/clangir/blob/a989ecb2c55da1fe28e4072c31af025cba6c4f0f/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp#L53)

…lvm#1235) Use iterator to visit std::initializer_list field reduce the readability

…d neon_vaddvq_f64 (llvm#1238) [Neon intrinsic definition](https://developer.arm.com/architectures/instruction-sets/intrinsics/vaddv_f32). They are vector across operation which LLVM doesn't currently have a generic intrinsic about it. As a side note for brainstorm, it might be worth in the future for CIR to introduce Vector Across type operations even though LLVM dialect doesn't have it yet. This would help to expose opt opportunities. E.g. a very trivial constant fold can happen if we are adding across a constant vector.

…#1239) This implementation is different from OG in the sense we chose to use CIR op which eventually lowers to generic LLVM intrinsics instead of llvm.aarch64.neon intrinsics But down to the ASM level, [they are identical ](https://godbolt.org/z/Gbbos9z6Y).

…vm#1242) This patch follows llvm#1220 (comment) by augmenting `CIR_Type` with a new field, `tbaaName`. Specifically, it enables TableGen support for the `-gen-cir-tbaa-name-lowering` option, allowing for the generation of `getTBAAName` functions based on the `tbaaName`. This enhancement enables us to replace the hardcoded TBAA names in the `getTypeName` function with the newly generated `getTBAAName`.

This PR adds a bitcast when we rewrite globals type. Previously we just set a new type and it worked. But recently I started to test ClangIR with CSmith in order to find some run time bugs and faced with the next problem. ``` typedef struct { int x : 15; uint8_t y; } S; S g = { -12, 254}; int main() { printf("%d\n", g.y); return 0; } ``` The output for this program is ... 127 but not 254! The reason is that first global var is created with the type of struct `S`, then `get_member` operation is generated with index `1` and then after, the type of the global is rewritten - I assume because of the anon struct created on the right side in the initialization. But the `get_member` operation still wants to access to the field at index `1` and get a wrong byte. If we change the `y` type to `int` we will fail on the verification stage. But in the example above it's a run time error! This is why I suggest to add a bitcast once we have to rewrite the global type.

…2/u32, (llvm#1240) Co-authored-by: Bruno Cardoso Lopes <bruno.cardoso@gmail.com>

bcardosolopes · 2024-12-20T17:39:17Z

clang/lib/CIR/CodeGen/CIRGenTBAA.cpp

+    if (!codeGenOpts.PointerTBAA) {
+      return cir::TBAAScalarAttr::get(mlirContext, types.ConvertType(qty));
+    }
+    llvm_unreachable("NYI");


The same problems we discussed about llvm_unreachable before is valid for the whole implementation of TBAA, please move to missing features for here, elsewhere and any future work

bcardosolopes · 2024-12-20T17:51:27Z

Please make this only change enough to fix #1246, seems to be adding more than necessary for now.

lanza and others added 30 commits November 22, 2024 18:23

[CIR][CodeGen] Support static references to temporaries

3a2fc4b

Pull Request: llvm#872

[CIR][CIRGen][Builtin][Neon] Lower vld1_dup and vld1q_dup (llvm#936)

0668d29

[CIR] [Lowering] Fix handling of multiple array for ZeroAttr (llvm#970)

5f9b3b8

This is the following up fix for the previous fix llvm#961 See the attached new test for the reproducer. Sorry for the initial overlook.

[CIR][Driver] Fix -fclangir-call-conv-lowering behavior

cb888fa

- The flag is the default even for cc1, so make it disable two level deep. - While here, remove the unnecessary flag disable for pure `-emit-cir`.

[CIR] Add more user facing messages for -fno-clangir-call-conv-lowering

7c851c3

[CIR][NFC] Massively rename workarounds for callconv lowering

dd9423f

These are not meant to be used by any other component, make sure it's very specific.

[CIR][CIRGen] Implement CIRGenModule::shouldEmitFunction (llvm#984)

761c564

This is the usual copy-paste-modify from CodeGen, though I changed all the variable names to conform to our new style. All these functions should be pulled out as common helpers when we're upstream.

[CIR][LowerToLLVM] Lower cir.vtt.address_point

8a12125

[CIR][NFC] Rename function in a test

26a1cae

[CIR][CIRGen] Get more vtable and dtor working

d8fc11b

[CIR][CIRGen][NFC] Consolidate RUN lines for builtin tests (llvm#968)

1091422

There is no change to testing functionality. This refacot let those files have the same Run options that is easier to maintain and extend.

[CIR] [Lowering] care trailing zero for lowering constant array (llvm…

11cbb68

…#976) Close llvm#975 See the attached test case for example

[CIR][CIRGen][Builtin][Neon] Lower neon_vrshrn_n to llvm intrinsic ca…

bafe969

…ll (llvm#982)

[CIR][CIRGen][Builtin][Neon] Lower vqdmulhq_lane, vqdmulh_lane, vqrdm…

526f3b7

…ulhq_lane and vqrdmulh_lane (llvm#985)

[CIR][NFC] Fix some consistency issues with missing features

4183dde

[CIR][CIRGen][Builtin] Implement builtin addressof (llvm#987)

0522f59

[CIR][NFC] Improve documentation about goto's

f6f02d1

[CIR][NFC] Move code around to match OG

cb384db

[CIR][CIRGen][NFC] Add more skeleton for handling inheritance ctors

8c92d88

While here add some bits for ptr auth and match OG.

[CIR][CIRGen] Ensure default visibility for local linkage functions (l…

da80c19

…lvm#990) LLVM's verifier enforces this, which was previously causing us to fail verification. This is a bit of a band-aid; the overall linkage and visibility setting flow needs some work to match the original.

ghehg and others added 23 commits December 12, 2024 09:41

[CIR][Dialect] Add BinOpKind_Max (llvm#1201)

994ceb5

This would facilitate implementation of neon intrinsic `neon_vmax_v` and `__builtin_elementwise_max`, and potentially future optimizations. CIR BinOp supports vector type. Floating point has already been supported by FMaxOp.

[CIR][CIRGen][Builtin][Neon] Lower neon_vaddvq_u32 and neon_vaddvq_u64 (

177d89b

llvm#1225)

[CIR][CIRGen] Match skeleton for setCIRFunctionAttributesForDefinition (

1156c06

llvm#1232) Match `CodeGenModule::SetLLVMFunctionAttributesForDefinition` so that we can see what's missing and have a good base to build upon.

[CIR][ThroughMLIR][NFC] BinOpKind_Max NYI ThroughMLIR lowering (llvm#…

a989ecb

…1233)

[CIR] Put loop body in nested scopes (llvm#1221)

49edd4b

This PR puts for-loop body, while-loop body, and do-while-loop body in nested scopes. Allocas in the loop body are now push down to the nested scope. Resolve llvm#1218 .

[CIR][CIRGen][TBAA] Add support for scalar types (llvm#1220)

c94c04e

[CIR][CodeGen][NFC] centralized split std::initializer_list field (l…

d3193f8

…lvm#1235) Use iterator to visit std::initializer_list field reduce the readability

[CIR][NFC] Silence warning from recent PR

2ec1a24

[CIR][CIRGen][Builtin][Neon] Lower neon_vaddvq_s64/s32, neon_vaddv_s3…

363440c

…2/u32, (llvm#1240) Co-authored-by: Bruno Cardoso Lopes <bruno.cardoso@gmail.com>

[CIR][CIRGen][TBAA] Initial TBAA support for union, enum and bitint

34dc621

PikachuHyA requested review from lanza and bcardosolopes as code owners December 20, 2024 06:42

PikachuHyA mentioned this pull request Dec 20, 2024

[CIR][CIRGen][TBAA] Struct with Union crashes during CodeGen #1246

Closed

bcardosolopes changed the title ~~[CIR][CIRGen][TBAA] Initial TBAA support for union, enum and bitint~~ [CIR][CIRGen][TBAA][NFC] Skeleton for union, enum and bitint Dec 20, 2024

bcardosolopes requested changes Dec 20, 2024

View reviewed changes

lanza force-pushed the main branch from 9ffbe92 to 3c3b096 Compare January 27, 2025 22:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CIR][CIRGen][TBAA][NFC] Skeleton for union, enum and bitint #1250

[CIR][CIRGen][TBAA][NFC] Skeleton for union, enum and bitint #1250

PikachuHyA commented Dec 20, 2024

bcardosolopes Dec 20, 2024

bcardosolopes commented Dec 20, 2024

[CIR][CIRGen][TBAA][NFC] Skeleton for union, enum and bitint #1250

Are you sure you want to change the base?

[CIR][CIRGen][TBAA][NFC] Skeleton for union, enum and bitint #1250

Conversation

PikachuHyA commented Dec 20, 2024

bcardosolopes Dec 20, 2024

Choose a reason for hiding this comment

bcardosolopes commented Dec 20, 2024